Supplementary data
This page contains supplementary data for the paper
Li Liao and William Stafford Noble. "Combining pairwise sequence similarity and support vector machines for remote protein homology detection." Proceedings of the Sixth Annual International Conference on Research in Computational Molecular Biology, April 18-21, 2002. pp. 225-232.
as well as for the extended journal version of the same paper
Li Liao and William Stafford Noble. "Combining pairwise sequence similarity and support vector machines for detecting remote protein evolutionary and structural relationships." Journal of Computational Biology. 10(6):857-868, 2003.
The full text of each paper is available from the links above.
- ROC and median RFP scores for all families and all six homology detection methods from the conference paper in HTML and plain text formats.
- ROC and median RFP scores for all families and five homology detection methods from the journal paper in HTML and plain text formats. The differences with respect to the conference paper results are as follows:
  - two additional methods were added,
  - PSI-BLAST was run using CLUSTALW alignments, rather than starting from a single training set sequence,
  - a bug was fixed in the FPS implementation, and
  - a bug was fixed in the scoring routines that calculate ROC and RFP.
- Tab-delimited table specifying the positive and negative training and test sets for each family. Each row is one sequence, and each column is one family. (0 = not present; 1 = positive train; 2 = negative train; 3 = positive test; 4 = negative test).
- Names of the SCOP families.
- Sequence file in FASTA format containing all sequences in SCOP version 1.53 with a pairwise similarity threshold of 10^-25.
- Gzipped, tab-delimited table containing Smith-Waterman p-values for all pairs of sequences (31 MB).
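
The training and test set table described above (one row per sequence, one column per family, codes 0 to 4) can be turned into per-family splits with a few lines of Python. This is only a minimal sketch: it assumes that the first row holds the family names and that the first column holds the sequence identifier, neither of which is stated on this page, so check the actual file before relying on it. The function name, the split names, and the identifiers in the example are hypothetical.

```python
import csv
import io

# Label codes as documented on this page (0 = not present is skipped).
CODES = {
    1: "positive_train",
    2: "negative_train",
    3: "positive_test",
    4: "negative_test",
}


def read_family_splits(handle):
    """Return {family: {split_name: [sequence_id, ...]}}.

    Assumption (not confirmed by the page): row 1 is a header of family
    names, and column 1 of every other row is the sequence identifier.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    families = header[1:]
    splits = {fam: {name: [] for name in CODES.values()} for fam in families}
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        seq_id, labels = row[0], row[1:]
        for fam, label in zip(families, labels):
            code = int(label)
            if code == 0:
                continue  # sequence is not used for this family
            splits[fam][CODES[code]].append(seq_id)
    return splits


# Tiny in-memory example with made-up sequence and family names.
example = (
    "seq\tfamA\tfamB\n"
    "s1\t1\t2\n"
    "s2\t3\t0\n"
    "s3\t2\t4\n"
    "s4\t4\t1\n"
)
splits = read_family_splits(io.StringIO(example))
for family, parts in splits.items():
    print(family, {name: len(ids) for name, ids in parts.items()})
```

For the real file, pass an open text handle instead of the in-memory string, for example open(path, newline=""), where path is wherever the downloaded table was saved.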